<!-- End of mathjax configuration --></head>
[READ THIS] Before you start¶
Welcome to UpLevel mini-projects! In this series, you're challenged to independently work (with our guidance) with data that you will collect to UpLevel yourself.
We hope you're excited to embark on this adventure.
Warning:¶
This isn't just any coding course or programme where you receive helper code as you move from one code block to the next.
This is when things get real.
In this project, you will receive instructions to execute a task along with an intended outcome. Most importantly, we will point you to places where you can pick up code independently and implement it in this notebook.
Don't worry though, we'll be dropping lots of resources you can consult, and these readings will contain everything you need to succeed. You can also do your own research to find answers. You just have to read closely and pick out the parts that make the most sense.
We make you do this not because we're lazy bastards but because being able to independently find code is a highly underrated skill, and it's something all companies look out for.
If you're really stuck and are on the verge of giving up, we gotchu fam. Head on over to https://www.facebook.com/UpLevelSG/ and post your questions there.
What we'll be doing:¶
In this project, we will do the following:
- Call the taxi availability API from Data.gov.sg to collect taxi data (Part I)
- Perform data cleaning (Part II)
- Perform exploratory data analysis (Part III)
- Train a machine learning model (Part IV)
Expectations:¶
We're not going to sugarcoat it - it'll be challenging at times. You have to promise to put in the time and effort to UpLevel yourself.
But we promise you that it'll ultimately be fun and rewarding, and you'll come out of it stronger and more confident than before.
Introduction¶
In Singapore, all of the taxis are connected to a central system that tracks their positions at all times. It's even cooler because the Singaporean government collects these data and anyone - including you - can obtain the data for analysis.
We will collect the data for one month in 2019, and perform analysis, followed by modelling.
In this notebook, you will do the following:
- Import your pandas library
- Call the Taxi Availability API from Data.gov.sg
- Organize the JSON data
- Export your DataFrame as a CSV file
Step 1: Import the following libraries¶
- pandas
- requests
Step 2: Visit the API website¶
Head on to https://data.gov.sg/dataset/taxi-availability to check out the API.
When you load the page (accurate as of Jun 2020), take note of a few things:
- The command to use for your API call (more details later)
- The API URL for the latest taxi availability
- A button that lets you try to call the API in the browser
Notice that there is a "Parameter" section as well, but we'll come to that later.
Step 3: Click on "Try it out"¶
Let's give the API a spin. In the page, try the following:
- Click "Try it out"
- Click "Execute"
The window below expands, exposing three new things:
- A text box that you can fill in with a string containing the date and time
- New sections, one in particular titled "Request URL"
- The response of the API
You can also try entering the "Request URL" in your browser to see what you get!
Step 4: Add a date into the text box containing time¶
This is important because whenever you execute Step 3 as-is, you get the latest taxi availability. For this project, however, we'll be collecting data only for the month of January 2019.
Enter the date and time in the format YYYY-MM-DD[T]HH:mm:ss, where:
- YYYY = year in four numbers
- MM = zero-padded months, e.g., 01 vs 12 for Jan and Dec respectively
- DD = zero-padded days, e.g., 01 to 31
- HH = hour in 24-hour format, e.g., 00 to 23
- mm = zero-padded minutes, e.g., 00 to 59
- ss = zero-padded seconds, e.g., 00 to 59
Note: The [T] in the middle is telling you to put the string T in between the date and the time.
Once you execute this, take note of the resulting "Request URL". We will use it for our API call in Python in Step 5.
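If you'd like to double-check the format before typing it into the text box, here is a minimal Python sketch that builds such a string with the standard library's `datetime.strftime`. The variable names are just for illustration.

```python
from datetime import datetime

# An example moment in January 2019: year, month, day, hour, minute, second.
moment = datetime(2019, 1, 1, 0, 5, 0)

# %Y-%m-%d gives the zero-padded date, the literal T separates date and time,
# and %H:%M:%S gives the zero-padded 24-hour time.
date_time_string = moment.strftime("%Y-%m-%dT%H:%M:%S")

print(date_time_string)  # 2019-01-01T00:05:00
```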
Step 5: Test with one API call first with requests¶
Now that we've seen how it's done on the browser, we will be using Python to make an API call.
If this is the first time you're making an API call, or you're unsure on how to make an API call, here's a handy resource: https://www.dataquest.io/blog/python-api-tutorial/
Here is what you need to do:
- use requests to get the response of the URL that you found from Step 4
- save the response in a variable
- use .json() to get the JSON data
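The three bullets above could look roughly like the sketch below. Treat it as one possible shape rather than the official answer: the endpoint in `BASE_URL` and the `date_time` parameter name are assumptions, so check them against the "Request URL" you saw in Step 4. The function name `fetch_taxi_availability` and its optional `session` argument are made up for this example.

```python
import requests

# ASSUMPTION: replace this with the base part of the "Request URL" from Step 4.
BASE_URL = "https://api.data.gov.sg/v1/transport/taxi-availability"


def fetch_taxi_availability(date_time, session=None):
    """Call the API for one date-time string and return the parsed JSON."""
    # Use the given session if there is one, otherwise the requests module itself.
    http = session if session is not None else requests
    # ASSUMPTION: the query parameter is named "date_time".
    response = http.get(BASE_URL, params={"date_time": date_time}, timeout=30)
    # Raise an exception on 4xx/5xx responses instead of failing silently.
    response.raise_for_status()
    # .json() turns the response body into Python dictionaries and lists.
    return response.json()
```

Calling `fetch_taxi_availability("2019-01-01T00:00:00")` should then give you the JSON data as a Python dictionary, provided the endpoint and parameter name match what the API page shows.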
This is the expected output of the API call if you did it correctly¶
| | type | geometry.type | geometry.coordinates | properties.timestamp | properties.taxi_count | properties.api_info.status |
|---|---|---|---|---|---|---|
| 0 | Feature | MultiPoint | [[103.6267, 1.307992], [103.63226, 1.30884], [... | 2018-12-31T23:59:44+08:00 | 5887 | healthy |
Step 6: Turn the JSON response to a DataFrame¶
We'll practise turning a JSON response into a DataFrame directly first.
Hint: Google "turn json response to dataframe"
Hint 2: Pandas has something useful for you
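One option the hints may lead you to is `pandas.json_normalize`. The sketch below runs it on a small hand-made dictionary instead of a live response. The shape of `sample_response` is an assumption based on the columns shown in this notebook, and the coordinates are cut down to two points.

```python
import pandas as pd

# Hand-made stand-in for the API response (assumed shape, heavily shortened).
sample_response = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "MultiPoint",
                "coordinates": [[103.6267, 1.307992], [103.63226, 1.30884]],
            },
            "properties": {
                "timestamp": "2018-12-31T23:59:44+08:00",
                "taxi_count": 5887,
                "api_info": {"status": "healthy"},
            },
        }
    ],
}

# Turn the whole response into a DataFrame directly.
direct_df = pd.json_normalize(sample_response)

# Everything under "features" stays packed inside a single cell as a list.
print(direct_df.columns.tolist())
print(direct_df.shape)
```

Notice that there is no coordinates column yet: the whole list of features is still sitting in one cell.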
Step 7: Get the JSON's "features" only¶
Wait a minute, where are the coordinates? Turns out we turned the JSON object too directly into a DataFrame. As such, we dig in deeper and get only the values from "features".
Think of JSON as a huge dictionary, and take only the values from "features".
This is what you'll see if you turn only the 'features' part of the JSON into a DataFrame!
Hint: Google ways to get values from dictionaries using keys
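Here is a minimal sketch of that idea, again on a hand-made dictionary rather than a live response. The shape of `sample_response` is an assumption based on the columns shown in this notebook.

```python
import pandas as pd

# Hand-made stand-in for the API response (assumed shape, heavily shortened).
sample_response = {
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "MultiPoint",
                "coordinates": [[103.6267, 1.307992], [103.63226, 1.30884]],
            },
            "properties": {
                "timestamp": "2018-12-31T23:59:44+08:00",
                "taxi_count": 5887,
                "api_info": {"status": "healthy"},
            },
        }
    ]
}

# Use the "features" key to pull out the list of features, as you would
# with any dictionary, then flatten each feature into one row.
features = sample_response["features"]
features_df = pd.json_normalize(features)

# Nested keys are joined with a dot, e.g. "geometry.coordinates".
print(features_df.columns.tolist())
```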
Step 8: Dissect the API call to get a pattern¶
Okay, now that we've successfully turned the JSON into a DataFrame containing one row, we can proceed with calling the API for the rest of January 2019.
We want to be granular, but not too granular so we will be getting 5-min interval data. For example:
- Starts at 2019-01-01T00:00:00
- Next one is 2019-01-01T00:05:00
- We go on until 2019-01-31T00:00:00
Of course, we will have to automate it, since doing this manually would be far too tedious.
Do you see a pattern? There are two things to take note of:
- each request URL has a part of a string that doesn't change (base URL)
- each request URL has a string that changes (the datetime)
Also, in the request URL, the hours, minutes, and seconds are separated by the string "%3A", which is the URL-encoded form of the colon (":"). We will need to take that into account later on.
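To see the two parts of the pattern side by side, here is a small sketch using the standard library's `urllib.parse.quote`. The `BASE_URL` value is an assumption, so swap in the unchanging part of your own request URL.

```python
from urllib.parse import quote

# ASSUMPTION: the part of the request URL that never changes.
BASE_URL = "https://api.data.gov.sg/v1/transport/taxi-availability?date_time="

# The part that changes from one request to the next.
date_time = "2019-01-01T00:05:00"

# quote() percent-encodes reserved characters; with safe="" the colon
# becomes %3A, while letters, digits and hyphens are left alone.
encoded = quote(date_time, safe="")

# A plain string replacement gives the same result for this format.
replaced = date_time.replace(":", "%3A")

request_url = BASE_URL + encoded
print(encoded)  # 2019-01-01T00%3A05%3A00
print(request_url)
```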
Step 9: Generate a list of datetimes in 5-minute intervals¶
We are going to create a list containing all of the possible combinations of the date and time in 5-min intervals between 2019-01-01 and 2019-01-31.
This is one possible way to do it. As long as you end up with a set of datetimes in 5-minute intervals, it's fine.
Hint: Google "generate interval of dates pandas"
DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 00:05:00',
'2019-01-01 00:10:00', '2019-01-01 00:15:00',
'2019-01-01 00:20:00', '2019-01-01 00:25:00',
'2019-01-01 00:30:00', '2019-01-01 00:35:00',
'2019-01-01 00:40:00', '2019-01-01 00:45:00',
...
'2019-01-31 23:15:00', '2019-01-31 23:20:00',
'2019-01-31 23:25:00', '2019-01-31 23:30:00',
'2019-01-31 23:35:00', '2019-01-31 23:40:00',
'2019-01-31 23:45:00', '2019-01-31 23:50:00',
'2019-01-31 23:55:00', '2019-02-01 00:00:00'],
dtype='datetime64[ns]', length=8929, freq='5T')
array([['2019-01-01', '00:00:00', '2019-01-01', ..., '2019-01-16',
'11:55:00', '2019-01-16'],
['12:00:00', '2019-01-16', '12:05:00', ..., '23:55:00',
'2019-02-01', '00:00:00']], dtype='<U10')
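One way to get a DatetimeIndex like the one above is `pandas.date_range`. This is a sketch, not the only answer. The end point of 2019-02-01 is chosen here to match the printed output (length 8929), and `freq="5min"` means the same thing as the `5T` shown there, which newer pandas versions prefer to spell as `5min`.

```python
import pandas as pd

# Every 5 minutes from the start of 1 Jan 2019 to the start of 1 Feb 2019,
# with both end points included.
datetimes = pd.date_range(start="2019-01-01", end="2019-02-01", freq="5min")

print(datetimes[:3])
print(len(datetimes))  # 8929 = 31 days * 288 intervals per day + 1
```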
Step 10: Generate a list of datetime in proper format for API¶
As you may have noticed, the list is still not quite suitable for calling the API.
You'll need a list containing properly formatted date and time strings.
We will expect something like this. There are a few ways to do this, but here's a suggestion.
- create a list containing your date, along with a string "T" in it
- create a list containing your hour
- create a list containing your minute
- zip all of these lists together, with the extra separator strings in between
This part is a little tough, but you can do it!
Hint: Google "concatenate two lists element wise python"
['2019-01-01T00:00:00', '2019-01-01T00:05:00', '2019-01-01T00:10:00', '2019-01-01T00:15:00', '2019-01-01T00:20:00', '2019-01-01T00:25:00', '2019-01-01T00:30:00', '2019-01-01T00:35:00', '2019-01-01T00:40:00', '2019-01-01T00:45:00']
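The suggestion above could be sketched like this, using only the first ten datetimes to keep the output short. The list names are made up for the example. The last line shows a shortcut, `DatetimeIndex.strftime`, which is a different technique from the zip approach but should give the same strings.

```python
import pandas as pd

# First ten 5-minute steps of January 2019.
datetimes = pd.date_range("2019-01-01 00:00:00", periods=10, freq="5min")

# One list per piece: the date followed by "T", the hour, and the minute.
dates = [ts.strftime("%Y-%m-%d") + "T" for ts in datetimes]
hours = [ts.strftime("%H") for ts in datetimes]
minutes = [ts.strftime("%M") for ts in datetimes]

# zip walks through the three lists in step; the colons and the seconds
# (always 00 at 5-minute intervals) are the extra strings in between.
formatted = [d + h + ":" + m + ":00" for d, h, m in zip(dates, hours, minutes)]

# Shortcut: format the whole DatetimeIndex in one go.
shortcut = datetimes.strftime("%Y-%m-%dT%H:%M:%S").tolist()

print(formatted)
```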
Step 11: Make your API calls for the entire duration (takes 1-2 hours)¶
Let's make the API calls! This will take a while, around 1-2 hours depending on your Internet speed.
When you run this, make sure you have a bit of time to spare. But after you're done with the entire call, it's just a few more lines to finish Part I.
This is the sequence of events:
- declare your base URL string
- declare a variable containing an empty list
- use a for loop to loop through the list of strings containing dates
- in each loop, combine the base URL string with the date
- perform the API call
- get the response, extract only the features
- turn that feature into a DataFrame
- append the DataFrame into the list you initialized
- after the entire loop, concatenate all of the DataFrames you have in the list into a combined DataFrame
You'll see something like this.
Hint: String concatenation is your friend
Hint 2: Google "combine dataframe in list pandas"
100%|████████████████████████████████████████████████████████████████████████████| 8928/8928 [1:11:18<00:00, 2.09it/s]
| | type | geometry.type | geometry.coordinates | properties.timestamp | properties.taxi_count | properties.api_info.status |
|---|---|---|---|---|---|---|
| 0 | Feature | MultiPoint | [[103.6267, 1.307992], [103.63226, 1.30884], [... | 2018-12-31T23:59:44+08:00 | 5887 | healthy |
| 0 | Feature | MultiPoint | [[103.63213, 1.31121], [103.63766, 1.30045], [... | 2019-01-01T00:04:44+08:00 | 4001 | healthy |
| 0 | Feature | MultiPoint | [[103.63145, 1.31125], [103.6376, 1.300248], [... | 2019-01-01T00:09:44+08:00 | 5981 | healthy |
| 0 | Feature | MultiPoint | [[103.63132, 1.3216], [103.63314, 1.32474], [1... | 2019-01-01T00:14:45+08:00 | 5461 | healthy |
| 0 | Feature | MultiPoint | [[103.628, 1.31262], [103.63714, 1.29914], [10... | 2019-01-01T00:19:45+08:00 | 5003 | healthy |
| | type | geometry_type | geometry_coordinates | properties_timestamp | properties_taxi_count | properties_api_info_status |
|---|---|---|---|---|---|---|
| 0 | Feature | MultiPoint | [[103.6267, 1.307992], [103.63226, 1.30884], [... | 2018-12-31T23:59:44+08:00 | 5887 | healthy |
| 1 | Feature | MultiPoint | [[103.63213, 1.31121], [103.63766, 1.30045], [... | 2019-01-01T00:04:44+08:00 | 4001 | healthy |
| 2 | Feature | MultiPoint | [[103.63145, 1.31125], [103.6376, 1.300248], [... | 2019-01-01T00:09:44+08:00 | 5981 | healthy |
| 3 | Feature | MultiPoint | [[103.63132, 1.3216], [103.63314, 1.32474], [1... | 2019-01-01T00:14:45+08:00 | 5461 | healthy |
| 4 | Feature | MultiPoint | [[103.628, 1.31262], [103.63714, 1.29914], [10... | 2019-01-01T00:19:45+08:00 | 5003 | healthy |
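The sequence of events in this step can be sketched as below. To keep it runnable without waiting an hour, the real API call is swapped for `fake_fetch`, a made-up stand-in that returns a tiny response with an assumed shape. In your notebook, that is where you would combine the base URL with the date string and call the API. The last two lines of `collect_features` are one guess at how to get from the first table above (index all 0, dots in the column names) to the second (index 0, 1, 2, ... and underscores).

```python
import pandas as pd


def fake_fetch(date_time):
    """Stand-in for the real API call (hypothetical, assumed response shape)."""
    return {
        "features": [
            {
                "type": "Feature",
                "geometry": {
                    "type": "MultiPoint",
                    "coordinates": [[103.6267, 1.307992]],
                },
                "properties": {
                    "timestamp": date_time + "+08:00",
                    "taxi_count": 1,
                    "api_info": {"status": "healthy"},
                },
            }
        ]
    }


def collect_features(date_strings, fetch_json):
    """Fetch every date string and stack the features into one DataFrame."""
    frames = []  # the empty list declared before the loop
    for date_string in date_strings:
        data = fetch_json(date_string)            # perform the call
        features = data["features"]               # extract only the features
        frames.append(pd.json_normalize(features))
    # Concatenate once after the loop and renumber the rows from 0.
    combined = pd.concat(frames, ignore_index=True)
    # Swap the dots in the column names for underscores.
    combined.columns = combined.columns.str.replace(".", "_", regex=False)
    return combined


date_strings = ["2019-01-01T00:00:00", "2019-01-01T00:05:00", "2019-01-01T00:10:00"]
combined_df = collect_features(date_strings, fake_fetch)
print(combined_df.shape)  # (3, 6)
```

Collecting the DataFrames in a list and concatenating once at the end is usually quicker than growing a DataFrame inside the loop.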
Step 12: Create a new column "time" in the new DataFrame¶
Well done! Hope that didn't take long.
Now that we have this exciting new DataFrame, we'll need to do one more thing: create a new column called time containing the date and time. Just use the list that you got from Step 9.
A few checks at the end after you're done:
- 7 columns
- 8,641 rows
DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 00:05:00',
'2019-01-01 00:10:00', '2019-01-01 00:15:00',
'2019-01-01 00:20:00', '2019-01-01 00:25:00',
'2019-01-01 00:30:00', '2019-01-01 00:35:00',
'2019-01-01 00:40:00', '2019-01-01 00:45:00',
...
'2019-01-31 23:15:00', '2019-01-31 23:20:00',
'2019-01-31 23:25:00', '2019-01-31 23:30:00',
'2019-01-31 23:35:00', '2019-01-31 23:40:00',
'2019-01-31 23:45:00', '2019-01-31 23:50:00',
'2019-01-31 23:55:00', '2019-02-01 00:00:00'],
dtype='datetime64[ns]', length=8929, freq='5T')
| | type | geometry_type | geometry_coordinates | properties_timestamp | properties_taxi_count | properties_api_info_status | time |
|---|---|---|---|---|---|---|---|
| 0 | Feature | MultiPoint | [[103.6267, 1.307992], [103.63226, 1.30884], [... | 2018-12-31T23:59:44+08:00 | 5887 | healthy | 2019-02-01 |
| 1 | Feature | MultiPoint | [[103.63213, 1.31121], [103.63766, 1.30045], [... | 2019-01-01T00:04:44+08:00 | 4001 | healthy | 2019-02-01 |
| 2 | Feature | MultiPoint | [[103.63145, 1.31125], [103.6376, 1.300248], [... | 2019-01-01T00:09:44+08:00 | 5981 | healthy | 2019-02-01 |
| 3 | Feature | MultiPoint | [[103.63132, 1.3216], [103.63314, 1.32474], [1... | 2019-01-01T00:14:45+08:00 | 5461 | healthy | 2019-02-01 |
| 4 | Feature | MultiPoint | [[103.628, 1.31262], [103.63714, 1.29914], [10... | 2019-01-01T00:19:45+08:00 | 5003 | healthy | 2019-02-01 |
| | type | geometry_type | geometry_coordinates | properties_timestamp | properties_taxi_count | properties_api_info_status | time |
|---|---|---|---|---|---|---|---|
| 0 | Feature | MultiPoint | [[103.6267, 1.307992], [103.63226, 1.30884], [... | 2018-12-31T23:59:44+08:00 | 5887 | healthy | 2019-01-01 00:00:00 |
| 1 | Feature | MultiPoint | [[103.63213, 1.31121], [103.63766, 1.30045], [... | 2019-01-01T00:04:44+08:00 | 4001 | healthy | 2019-01-01 00:05:00 |
| 2 | Feature | MultiPoint | [[103.63145, 1.31125], [103.6376, 1.300248], [... | 2019-01-01T00:09:44+08:00 | 5981 | healthy | 2019-01-01 00:10:00 |
| 3 | Feature | MultiPoint | [[103.63132, 1.3216], [103.63314, 1.32474], [1... | 2019-01-01T00:14:45+08:00 | 5461 | healthy | 2019-01-01 00:15:00 |
| 4 | Feature | MultiPoint | [[103.628, 1.31262], [103.63714, 1.29914], [10... | 2019-01-01T00:19:45+08:00 | 5003 | healthy | 2019-01-01 00:20:00 |
'2019-02-01'
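Here is a small sketch of adding the time column, using a three-row DataFrame with made-up contents apart from the taxi counts copied from the table above. It assumes the datetimes and the rows line up one-to-one, so check that the lengths match before assigning.

```python
import pandas as pd

# Tiny stand-in for the combined DataFrame from Step 11.
df = pd.DataFrame({"properties_taxi_count": [5887, 4001, 5981]})

# One datetime per row, 5 minutes apart.
datetimes = pd.date_range("2019-01-01 00:00:00", periods=len(df), freq="5min")

# The lengths must match, otherwise pandas raises a ValueError.
assert len(datetimes) == len(df)

# Assign the whole sequence so each row gets its own datetime. Assigning a
# single value instead (e.g. datetimes[-1]) would repeat it on every row.
df["time"] = datetimes

print(df)
```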
Step 13: Export your DataFrame as CSV¶
Congrats and very well done! It was tough but you persevered and did it.
You've successfully called an API and collected your own data. Now it's time to export your DataFrame into a CSV format so that you can continue in Part II.
We don't need the index to be exported so don't forget the additional argument.
Hint: Google "export dataframe to csv"
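As a rough sketch, the export could look like this. The file name `taxi_availability.csv` is made up, and a temporary folder is used here only so the example cleans up after itself. In your notebook you would write straight to your working folder.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Tiny stand-in for the final DataFrame.
df = pd.DataFrame(
    {
        "properties_taxi_count": [5887, 4001],
        "time": ["2019-01-01 00:00:00", "2019-01-01 00:05:00"],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "taxi_availability.csv"  # hypothetical file name
    # index=False leaves the row numbers out of the file.
    df.to_csv(csv_path, index=False)
    # Read it back to check what was written.
    reloaded = pd.read_csv(csv_path)

print(reloaded.columns.tolist())
```

Without `index=False`, reading the file back would give you an extra unnamed column holding the old row numbers.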
What's next?¶
Head on to Part II for the next part of your adventure!